Compositional differences within and between eukaryotic genomes.

نویسندگان

  • S Karlin
  • J Mrázek
چکیده

Eukaryotic genome similarity relationships are inferred using sequence information derived from large aggregates of genomic sequences. Comparisons within and between species sample sequences are based on the profile of dinucleotide relative abundance values (The profile is rho*XY = f*XY/f*Xf*Y for all XY, where f*X denotes the frequency of the nucleotide X and f*XY denotes the frequency of the dinucleotide XY, both computed from the sequence concatenated with its inverted complement). Previous studies with respect to prokaryotes and this study document that profiles of different DNA sequence samples (sample size >/=50 kb) from the same organism are generally much more similar to each other than they are to profiles from other organisms, and that closely related organisms generally have more similar profiles than do distantly related organisms. On this basis we refer to the collection (rho*XY) as the genome signature. This paper identifies rho*XY extremes and compares genome signature differences for a diverse range of eukaryotic species. Interpretations on the mechanisms maintaining these profile differences center on genome-wide replication, repair, DNA structures, and context-dependent mutational biases. It is also observed that mitochondrial genome signature differences between species parallel the corresponding nuclear genome signature differences despite large differences between corresponding mitochondrial and nuclear signatures. The genome signature differences also have implications for contrasts between rodents and other mammals, and between monocot and dicot plants, as well as providing evidence for similarities among fungi and the diversity of protists.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Assessment of compositional heterogeneity within and between eukaryotic genomes.

Using large amounts of long genomic sequences, we studied the compositional patterns of eukaryotic genomes. We developed a simple measure, the compositional heterogeneity (or variability) index, to compare the differences in compositional heterogeneity between long genomic sequences. The index measures the average difference in GC content between two adjacent windows normalized by the standard ...

متن کامل

Evolutionary History of the Human Genome

Compositional genomics is an approach to the problem of the organization of eukaryotic genomes. Initially this approach consisted of analysing the base composition of complex eukaryotic genomes (such as the nuclear genomes of vertebrates) by using density gradient centrifugation in the presence of sequence-specific ligands. This approach revealed that vertebrate genomes are mosaics of very long...

متن کامل

Compositional analysis of non-coding regions in eukaryotic genomes

Whereas, in the last years, many efforts have been devoted to locate genes within genomes, relatively few tools have been developed to identify the regulatory regions required for the correct transcriptional activity of the genome. This task is particularly difficult in the case of eukaryotic organisms for which regulatory regions represent a small percentage, overwhelmed by –presumablynon-func...

متن کامل

Mid-Range Inhomogeneity of Eukaryotic Genomes

Multicellular eukaryotic genomes are replete with nonprotein coding sequences, both within genes (introns) and between them (intergenic regions). Excluding the well-recognized functional elements within these sequences (ncRNAs, transcription factor binding sites, intronic enhancers/silencers, etc.), the remaining portion is made up of so-called "dark" DNA, which still occupies the majority of t...

متن کامل

IsoPlotter+: A Tool for Studying the Compositional Architecture of Genomes

Eukaryotic genomes, particularly animal genomes, have a complex, nonuniform, and nonrandom internal compositional organization. The compositional organization of animal genomes can be described as a mosaic of discrete genomic regions, called "compositional domains," each with a distinct GC content that significantly differs from those of its upstream and downstream neighboring domains. A typica...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Proceedings of the National Academy of Sciences of the United States of America

دوره 94 19  شماره 

صفحات  -

تاریخ انتشار 1997